Generic Quantum Fourier Transforms 



Cristopher Moore
Department of Computer Science
University of New Mexico

moore@cs.unm.edu

Daniel Rockmore
Department of Mathematics
Dartmouth College

rockmore@cs.dartmouth.edu



Alexander Russell 
Department of Computer Science and Engineering 
University of Connecticut 

acr@cse.uconn.edu 



February 1, 2008 



Abstract 



The quantum Fourier transform (QFT) is the principal algorithmic tool underlying most efficient quantum
algorithms. We present a generic framework for the construction of efficient quantum circuits for the QFT by
"quantizing" the separation of variables technique that has been so successful in the study of classical Fourier
transform computations. Specifically, this framework applies the existence of computable Bratteli diagrams, adapted
factorizations, and Gel'fand-Tsetlin bases to offer efficient quantum circuits for the QFT over a wide variety of
finite Abelian and non-Abelian groups, including all group families for which efficient QFTs are currently known and
many new group families. Moreover, the method gives rise to the first subexponential-size quantum circuits for the
QFT over the linear groups $GL_k(q)$, $SL_k(q)$, and the finite groups of Lie type, for any fixed prime power $q$.

1 Introduction 

Peter Shor's spectacular application of the Fourier transform over the cyclic group $\mathbb{Z}_n$ in the seminal
discovery of an efficient quantum factoring algorithm [25] has motivated broad interest in the problem of efficient
quantum computation over arbitrary groups (see, e.g., [3, 9, 13, 14, 21, 27]). While this research effort has become
quite ramified, two related themes have emerged: (i.) development of efficient quantum Fourier transforms and (ii.)
development of efficient quantum algorithms for the hidden subgroup problem. The complexity of these two problems
appears to relate intimately to the group in question: while quantum Fourier transforms and hidden subgroup problems
over Abelian groups are well-understood, our understanding of these basic problems over non-Abelian groups remains
embarrassingly sporadic. Aside from its natural appeal, this line of research has been motivated by the direct
relationship to the graph isomorphism problem: an efficient solution to the hidden subgroup problem over the
(non-Abelian) symmetric groups would yield an efficient quantum algorithm for graph isomorphism.

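Over the cyclic group $\mathbb{Z}_n$, the quantum Fourier transform refers to the transformation taking the state

\[ \sum_{z \in \mathbb{Z}_n} f(z)\,|z\rangle \quad\text{to the state}\quad \sum_{\omega \in \mathbb{Z}_n} \hat{f}(\omega)\,|\omega\rangle , \]

where $f : \mathbb{Z}_n \to \mathbb{C}$ is a function with $\|f\|_2 = 1$ and
$\hat{f}(\omega) = \frac{1}{\sqrt{n}} \sum_z f(z) e^{2\pi i \omega z/n}$ denotes the familiar discrete Fourier
transform at the frequency $\omega$. Over an arbitrary finite group $G$, it refers analogously to the transformation
taking the state

\[ \sum_{g \in G} f(g)\,|g\rangle \quad\text{to the state}\quad \sum_{\rho \in \hat{G}} \sum_{i,j=1}^{d_\rho} \hat{f}(\rho)_{i,j}\,|\rho, i, j\rangle , \]

where $f : G \to \mathbb{C}$, as before, is a function with $\|f\|_2 = 1$ and $\hat{f}(\rho)_{i,j}$ denotes the
$i,j$th entry of the Fourier transform at the representation $\rho$. This is explained further in Section 2.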
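
As a small illustration of the cyclic case, the following sketch applies the unitary discrete Fourier transform to a
vector of amplitudes on $\mathbb{Z}_n$ and confirms that the norm is preserved. It assumes the normalization
$1/\sqrt{n}$ and the sign convention $e^{2\pi i \omega z/n}$; the function names are illustrative only, and the code
simulates the linear map classically rather than building a quantum circuit.

```python
import cmath
import math

def qft_cyclic(amplitudes):
    """Unitary DFT over Z_n: f_hat(w) = n^(-1/2) * sum_z f(z) * exp(2*pi*i*w*z/n)."""
    n = len(amplitudes)
    scale = 1 / math.sqrt(n)
    return [
        scale * sum(amplitudes[z] * cmath.exp(2j * cmath.pi * w * z / n) for z in range(n))
        for w in range(n)
    ]

def norm_squared(vec):
    """Squared 2-norm of a list of complex amplitudes."""
    return sum(abs(a) ** 2 for a in vec)

# The basis state |1> on Z_4, viewed as a unit vector of amplitudes.
f = [0, 1, 0, 0]
f_hat = qft_cyclic(f)
```

Here every entry of `f_hat` has modulus 1/2, as expected for the transform of a point mass.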

While there is no known explicit relationship between the quantum Fourier transform and the hidden subgroup 
problem over a group G, all known efficient hidden subgroup algorithms rely on an efficient quantum Fourier trans- 
form. Indeed, it is fair to say that the quantum Fourier transform is the only known non-trivial quantum algorithmic 
paradigm for such problems. 

In this article we focus on the construction of efficient quantum Fourier transforms. Our research is motivated by
dramatic progress over the last decade in the theory of efficient classical Fourier transforms (see, e.g., [15, 22]).
These developments have provided a collection of techniques which, taken together, yield a uniform framework for the
efficient (classical) computation of Fourier transforms over a wide variety of important families of groups including,
for example, the finite groups of Lie type (properly parametrized) and the symmetric groups.

We present here an adaptation to the quantum setting of a wide class of efficient classical Fourier transform algo- 
rithms; namely, those achieved by the "separation of variables" approach. This establishes the first generic quantitative 
relationship between efficient classical Fourier transforms and efficient circuits for the quantum Fourier transform. 

Specifically, we define a broad class of polynomially uniform groups and show 

Theorem 1 If $G$ is a polynomially uniform group with a subgroup tower $G = G_m > \cdots > \{1\}$ with adapted
diameter $D$, maximum multiplicity $M$, and maximum index $I = \max_i [G_i : G_{i-1}]$, then there is a quantum
circuit of size poly($I \times D \times M \times \log |G|$) which computes the quantum Fourier transform over $G$.

This quantifies the complexity of the quantum Fourier transform in exactly the same fashion as does Corollary 3.1
of [17] in the classical case. We extend this class further by showing that it is closed under a certain type of
Abelian extension which may have exponential index.

Together, these results give efficient QFTs (namely, circuits of polylog($|G|$) size) for many families of groups.
These include (i.) the Clifford groups $CL_n$; (ii.) the symmetric groups, recovering the algorithm of Beals [3];
(iii.) wreath products $G \wr S_n$ where $|G| = \mathrm{poly}(n)$; (iv.) metabelian groups, including metacyclic
groups such as the dihedral and affine groups, recovering the algorithm of Høyer [13]; (v.) bounded extensions of
Abelian groups such as the generalized quaternions, recovering the algorithm of Püschel et al. [21].

Our methods also give the first subexponential-size quantum circuits for the linear groups $GL_k(q)$, $SL_k(q)$,
$PGL_k(q)$, and $PSL_k(q)$ for fixed prime power $q$, various families of finite groups of Lie type, and the
Chevalley and Weyl groups.

The paper is structured as follows. Sections 2 and 3 briefly summarize the representation theory of finite groups,
the Bratteli diagram, and adapted bases. We give our algorithms in Section 4 along with a list of group families for
which they provide efficient circuits for the QFT. We conclude with open problems in Section 5.

2 Representation theory background 

Fourier analysis over a group $G$ involves expressing arbitrary functions $f : G \to \mathbb{C}$ as linear
combinations of specific functions on $G$ which reflect the group's structure and symmetries. If $G$ is Abelian,
these are precisely the characters of $G$ (the homomorphisms of $G$ into $\mathbb{C}$). For a general group, they are
the irreducible matrix elements, and the Fourier transform is the change of basis from the basis of delta functions
to the basis of irreducible matrix elements.

In order to be precise we need the language of (finite) group representation theory (see, e.g., Serre [24] for an
excellent introduction). A representation $\rho$ of a finite group $G$ is a homomorphism $\rho : G \to U(V)$, where
$V$ is a (finite) $d_\rho$-dimensional vector space over $\mathbb{C}$ with an inner product and $U(V)$ denotes the
group of unitary linear operators on $V$. Fixing an orthonormal basis for $V$, each $\rho(g)$ may be realized as a
$d_\rho \times d_\rho$ unitary matrix. When a basis has been selected in this way for $V$, we refer to $\rho$ as a
matrix representation of $G$; then each of the $d_\rho^2$ functions $\rho_{ij}(g) = [\rho(g)]_{ij}$ is called a
matrix element (corresponding to $\rho$). As $\rho$ is a homomorphism, for any $g, h \in G$,
$\rho(gh) = \rho(g)\rho(h)$, implying that in general
$\rho_{ij}(gh) = \sum_{k=1}^{d_\rho} \rho_{ik}(g)\rho_{kj}(h)$.

A matrix representation $\rho$ of $G$ on $V$ is irreducible if no subspace (other than the trivial $\{0\}$ subspace
and $V$) is mapped into itself by every $\rho(g)$. This is equivalent to the statement that there is no change of
basis that yields a simultaneous block diagonalization (of given shape) of all $\rho(g)$. Otherwise the
representation is said to be reducible. The irreducible representations will play a role in the theory analogous to
that of the characters of an Abelian group. Two representations $\rho$ and $\sigma$ are equivalent if they differ
only by a change of basis, so that for some fixed unitary matrix $U$, $\sigma(g) = U^{-1}\rho(g)U$ for all
$g \in G$. Up to equivalence, a finite group $G$ has a finite number of irreducible representations, equal to the
number of its conjugacy classes. For a group $G$, we let $\hat{G}$ denote a collection of representations of $G$
containing exactly one from each isomorphism class of irreducible representations.

Selecting bases $B$ for the representations of $\hat{G}$ results in a set of (inequivalent irreducible) matrix
representations; when we wish to be explicit about this selection of bases, we denote such a collection
$\hat{G}_B$. The matrix elements of the matrix representations $\rho \in \hat{G}_B$ in fact form an orthogonal basis
for the $|G|$-dimensional vector space of complex-valued functions on $G$. This implies the important relationship
between the dimensions of the irreducible representations of $G$ and $|G|$:
$\sum_{\rho \in \hat{G}} d_\rho^2 = |G|$. Such a family gives rise to a general definition of Fourier transform.

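Definition 1 Let $f : G \to \mathbb{C}$; let $\rho : G \to U(V)$ be a matrix representation of $G$. The Fourier
transform of $f$ at $\rho$, denoted $\hat{f}(\rho)$, is the matrix

\[ \hat{f}(\rho) = \sqrt{\frac{d_\rho}{|G|}} \sum_{g \in G} f(g)\rho(g) . \]

We typically restrict our attention to $\hat{f}(\rho)$, where $\rho$ is irreducible.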

We refer to the collection of matrices $(\hat{f}(\rho))_{\rho \in \hat{G}_B}$ as the Fourier transform of $f$. Thus
$f$ is mapped into $|\hat{G}|$ matrices of varying dimensions. The total number of entries in these matrices is
$\sum_\rho d_\rho^2 = |G|$, by the equation mentioned above. The Fourier transform is linear in $f$; with the
constants used above ($\sqrt{d_\rho/|G|}$) it is in fact unitary, taking the $|G|$ complex numbers
$(f(g))_{g \in G}$ to $|G|$ complex numbers organized into matrices.
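
To make the unitarity claim concrete, the following sketch computes the Fourier transform of an arbitrary function on
$S_3$ with the normalization $\sqrt{d_\rho/|G|}$ and compares the squared norms before and after. The realization of
the two-dimensional representation (permutation matrices restricted to the plane orthogonal to $(1,1,1)$) and all
names are choices made here for illustration; they are not taken from the text.

```python
import itertools
import math

G = list(itertools.permutations(range(3)))  # the six elements of S_3

def compose(g, h):
    """Product gh of permutations stored as tuples: (gh)(j) = g(h(j))."""
    return tuple(g[h[j]] for j in range(len(h)))

def sign(g):
    """Sign of a permutation, computed from its number of inversions."""
    inv = sum(1 for a in range(len(g)) for b in range(a + 1, len(g)) if g[a] > g[b])
    return -1 if inv % 2 else 1

# Orthonormal basis of the plane orthogonal to (1, 1, 1).
U = [
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],
    [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)],
]

def standard(g):
    """2x2 orthogonal matrix of g on that plane, where P(g) e_j = e_{g(j)}."""
    return [[sum(U[a][g[j]] * U[b][j] for j in range(3)) for b in range(2)] for a in range(2)]

IRREPS = {
    "trivial": lambda g: [[1.0]],
    "sign": lambda g: [[float(sign(g))]],
    "standard": standard,
}

def fourier_transform(f):
    """Return {name: f_hat(rho)} with f_hat(rho) = sqrt(d_rho/|G|) * sum_g f(g) rho(g)."""
    out = {}
    for name, rho in IRREPS.items():
        d = len(rho(G[0]))
        scale = math.sqrt(d / len(G))
        out[name] = [[scale * sum(f[g] * rho(g)[a][b] for g in G) for b in range(d)]
                     for a in range(d)]
    return out

f = {g: complex(k + 1, -k) for k, g in enumerate(G)}  # arbitrary test function
f_hat = fourier_transform(f)
norm_f = sum(abs(v) ** 2 for v in f.values())
norm_f_hat = sum(abs(x) ** 2 for m in f_hat.values() for row in m for x in row)
```

The three output matrices have $1 + 1 + 4 = 6 = |S_3|$ entries in total, and `norm_f` and `norm_f_hat` agree up to
rounding.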

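For two complex-valued functions $f_1$ and $f_2$ on a group $G$, there is a natural inner product
$\langle f_1, f_2 \rangle$ given by $\frac{1}{|G|} \sum_{g} f_1(g) f_2(g)^*$. For any pair of matrix representations
$\rho, \sigma \in \hat{G}_B$, the corresponding irreducible matrix elements are orthogonal according to this inner
product: let $\rho$ and $\sigma$ be two elements of $\hat{G}_B$; then

\[ \langle [\rho(\cdot)]_{ij}, [\sigma(\cdot)]_{kl} \rangle = \begin{cases} 0 & \text{if } \rho \neq \sigma, \\ \frac{1}{d_\rho}\,\delta_{ik}\delta_{jl} & \text{if } \rho = \sigma. \end{cases} \tag{1} \]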

Computation of the Fourier transform (with respect to a given choice of $\hat{G}_B$) is equivalent to the change of
basis from that of the point masses to the irreducible matrix elements determined by $\hat{G}_B$. This linear map
(of the vector space of functions on $G$) is invertible, with (point-wise) inverse given by the Fourier inversion
formula:

\[ f(g) = \sum_{\rho \in \hat{G}_B} \sqrt{\frac{d_\rho}{|G|}}\; \mathrm{tr}\!\left( \rho(g^{-1})\, \hat{f}(\rho) \right) . \]

A reducible matrix representation $\rho : G \to U(V)$ may always be decomposed into irreducible representations;
specifically, there is a basis of $V$ in which each $\rho(g)$ is block diagonal, where the $i$th block of
$\rho(g)$ is precisely $\sigma_i(g)$ for some irreducible matrix representation $\sigma_i$. In this case we write
$\rho = \oplus_i \sigma_i$. The number of times a given $\sigma \in \hat{G}$ appears in this decomposition is the
multiplicity of $\sigma$ in $\rho$. If the irreducible representation $\sigma_i$ appears with multiplicity $m_i$ in
the decomposition of $\rho$, we may write
$\rho = \sigma_1^{\oplus m_1} \oplus \cdots \oplus \sigma_r^{\oplus m_r}$.

A representation $\rho$ of a group $G$ is also automatically a representation of any subgroup $H$. We refer to this
restricted representation on $H$ as $\rho|_H$. Note that in general, representations that are irreducible over $G$
may be reducible when restricted to $H$.

Note: The familiar Discrete Fourier Transform (DFT) corresponds to the case in which the group is cyclic. In this
case the representations are all one-dimensional, and if $G = \mathbb{Z}_n$, the linear transformation (i.e., the
Fourier transform) is an order-$n$ Vandermonde matrix using the $n$-th roots of unity.

3 Bratteli diagrams, Gel'fand-Tsetlin bases, and adapted diameters



The main ingredients for our algorithm are (i.) a tower of subgroups (or chain), which provides a means by which the
Fourier transform on $G$ can be built iteratively as an accumulation of Fourier transforms on increasingly larger
subgroups; (ii.) a natural indexing scheme for the representations, given by paths in the Bratteli diagram
corresponding to the group tower; and (iii.) a factorization of group elements in terms of a basic set of generators
which, when judiciously chosen, provides a factorization of the Fourier transform as a product of structured (direct
sums of tensor products) and sparse matrices. The complexity of a corresponding efficient Fourier transform which
uses these basic ingredients can then be derived in terms of basic representation-theoretic and combinatorial data.



3.1 Bratteli diagrams and Gel'fand-Tsetlin bases 



Much of Abelian Fourier analysis is simplified by the fact that in this case the dual (that is, the set of
characters $\chi : G \to \mathbb{C}$) also forms a group isomorphic to the original group; furthermore, in this
isomorphism lies a natural correspondence providing an indexing of the irreducible representations (i.e., matrix
elements). However, in the general case there is no immediate indexing scheme for the dual $\hat{G}$, and the
landscape is further complicated by the absence of a canonical basis for the now multidimensional representations.
Indeed, for the goal of efficient Fourier analysis, not all bases are created alike! In particular, a fairly general
methodology for the construction of group FFTs, the "separation of variables" approach [17, 18], relies on the use of
Gel'fand-Tsetlin or adapted bases for efficient computation. These bases allow for a Fourier transform on $G$ to be
built from Fourier transforms on subgroups, a general technique whose efficiency improves as it is used through a
tower of subgroups. This is in fact the main idea in the famous "Cooley-Tukey" (decimation-in-time) FFT.

A crucial ingredient of the general separation of variables approach is the incorporation of an indexing scheme that
permits the computation to be organized efficiently. The same Bratteli diagram formalism is key to both the
organization and manipulation of the calculation for a quantum FFT; we present it below.

Let $G$ be a finite group and let

\[ G = G_m > G_{m-1} > \cdots > G_1 > G_0 = \{1\} \]

be a tower of subgroups of length $m$ for $G$. The corresponding Bratteli diagram, denoted $\mathcal{B}$, is a
leveled directed multigraph whose nodes of level $i = 0, \ldots, m$ are in one-to-one correspondence with the
(inequivalent) irreducible representations of $G_i$. For convenience, we refer to vertices in the diagram by the
representation with which they are associated. The number of edges from an irreducible representation $\eta$ of
$G_i$ to $\rho$ of $G_{i+1}$ is equal to the multiplicity of $\eta$ in the restriction of $\rho$ to $G_i$. Since
there is a unique irreducible representation of the trivial group, a Bratteli diagram for a given tower has a unique
root.

Figure 1: The Bratteli diagrams for a tower of cyclic groups ending in $\mathbb{Z}_3 > 1$ (top) and for the tower
$S_4 > S_3 > S_2 > 1$ (bottom). Cyclic groups of order $n$ have representations indexed by the integers mod $n$, and
(assuming $m \mid n$) the representation corresponding to $j$ restricts to the representation corresponding to
$j \bmod m$. The lower diagram uses the well-known correspondence between irreducible representations of $S_n$ and
partitions of $n$. In this case restrictions from $S_n$ to $S_{n-1}$ are determined by those partitions obtained by
decrementing a part of the original partition.

Thus, the edges out of a node $\eta$ of $G_i$ represent a complete set of orthogonal embeddings of the corresponding
representation space into the representations of $G_{i+1}$ and, conversely, the edges entering a given representation
$\rho : G_{i+1} \to U(V_\rho)$ of $G_{i+1}$ index a set of mutually orthogonal subspaces of $V_\rho$ whose direct sum
represents the decomposition of $V_\rho$ under the (restricted) action of $G_i$. Thus, the paths from the root node
to a vertex $\rho : G_i \to U(V_\rho)$ index a basis of $V_\rho$ with the following property: for any $G_j < G_i$,
there is a partition of the basis vectors into subsets, each of which spans an irreducible $G_j$-invariant subspace,
so that the associated matrix representation is block diagonal according to this partition when restricted to $G_j$
and, moreover, the blocks for equivalent irreducible representations are actually equal. Such bases are said to be
(subgroup-)adapted or Gel'fand-Tsetlin. Consequently, the number of paths to a node $\eta$ is equal to $d_\eta$, and
pairs of paths with common endpoint $\eta$ index the irreducible matrix elements of $\eta$.
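
For the symmetric group tower, the path-counting statement can be checked directly. The sketch below builds the
Bratteli diagram edges for $S_n > S_{n-1} > \cdots$ by removing one cell from a partition (this restriction is
multiplicity-free, so each edge has multiplicity one), counts root-to-vertex paths, and confirms that the squares of
these counts sum to $n!$. The function names are illustrative assumptions.

```python
from functools import lru_cache
from math import factorial

def restrictions(partition):
    """Partitions of n-1 obtained by removing one removable cell (decrementing a part)."""
    result = []
    for i, part in enumerate(partition):
        below = partition[i + 1] if i + 1 < len(partition) else 0
        if part > below:  # the last cell of row i can be removed
            smaller = partition[:i] + (part - 1,) + partition[i + 1:]
            result.append(tuple(p for p in smaller if p > 0))
    return result

@lru_cache(maxsize=None)
def path_count(partition):
    """Number of paths from the root to this vertex, i.e. the dimension d of the irrep."""
    if sum(partition) <= 1:
        return 1
    return sum(path_count(smaller) for smaller in restrictions(partition))

def partitions(n, largest=None):
    """All partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        return [()]
    return [(k,) + rest
            for k in range(min(n, largest), 0, -1)
            for rest in partitions(n - k, k)]

dims_s4 = {p: path_count(p) for p in partitions(4)}
```

For $S_4$ this yields dimensions 1, 3, 2, 3, 1, whose squares sum to $24$.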

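The block diagonal nature of the restriction (combined with the fact that blocks corresponding to equivalent
representations are actually equal) allows the Fourier transform on $G = G_m$ to be expressed as a sum of Fourier
transforms on $G_{m-1}$, each translated from a distinct coset: specifically, if $T \subseteq G$ is a transversal,
i.e. a set of representatives for the left cosets of $G_{m-1}$ in $G_m$, we define
$f_\alpha : G_{m-1} \to \mathbb{C}$ by $f_\alpha(x) = f(\alpha x)$. Then, suppressing the normalization constants of
Definition 1,

\[ \hat{f}(\rho) = \sum_{\alpha \in T} \rho(\alpha) \sum_{x \in G_{m-1}} \rho(x) f(\alpha x) = \sum_{\alpha \in T} \rho(\alpha) \cdot \hat{f}_\alpha(\rho|_{G_{m-1}}) . \tag{2} \]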
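
The coset decomposition can be checked numerically in the simplest Abelian setting. The sketch below takes
$G = \mathbb{Z}_6$, the subgroup $H = \{0, 2, 4\}$, and the transversal $T = \{0, 1\}$ (all chosen here purely for
illustration), and compares the unnormalized transform computed directly with the one assembled from transforms over
$H$ and the twiddle factors $\rho(\alpha)$.

```python
import cmath

N = 6
H = [0, 2, 4]   # subgroup of Z_6, isomorphic to Z_3
T = [0, 1]      # transversal: one representative per coset of H in Z_6

def character(w, g):
    """The character rho_w(g) = exp(2*pi*i*w*g/N) of Z_N."""
    return cmath.exp(2j * cmath.pi * w * g / N)

def direct_transform(f, w):
    """Unnormalized Fourier transform of f at the character w, summed over all of Z_N."""
    return sum(f[g] * character(w, g) for g in range(N))

def subgroup_transform(f_alpha, w):
    """Transform of f_alpha over H at the restriction of the character w to H."""
    return sum(f_alpha[x] * character(w, x) for x in H)

def separated_transform(f, w):
    """Sum over cosets: twiddle factor rho(alpha) times the transform of f_alpha over H."""
    total = 0
    for alpha in T:
        # f restricted to the coset alpha + H, shifted back into H
        f_alpha = {x: f[(alpha + x) % N] for x in H}
        total += character(w, alpha) * subgroup_transform(f_alpha, w)
    return total

f = [1, 2j, -1, 0.5, 3, -2j]  # arbitrary test function on Z_6
errors = [abs(direct_transform(f, w) - separated_transform(f, w)) for w in range(N)]
```

Note that the restricted character depends only on $w \bmod 3$, so only three distinct transforms over $H$ are needed
per coset; this reuse is the source of the savings in the classical Cooley-Tukey FFT.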

3.2 Strong generating sets and adapted diameters

Adapted representations are only part of the story for the construction of efficient Fourier transform algorithms.
In general, $\rho(\alpha)$ of Equation (2), the "twiddle factor", could be an arbitrary matrix of exponential size,
so implementing it could be costly. Luckily, under fairly mild assumptions, the matrices $\rho(\alpha)$ can be
factored into polylog($|G|$) sparse, highly structured matrices, and can therefore be implemented with
polylog($|G|$) elementary quantum operators.

We say that $S$ is a strong generating set for the tower of subgroups $\{G_i\}$ if $S \cap G_i$ generates $G_i$ for
each $i$. Suppose that we have chosen a transversal $T_i$ for each $i$, indexing the cosets of $G_{i-1}$ in $G_i$.
Now define $D_i = \min\{\ell > 0 : \bigcup_{j \le \ell} (S \cap G_i)^j \supseteq T_i\}$, and define the adapted
diameter $D = \sum_i D_i$. Then clearly any group element can be factored as a series of coset representatives,
which in turn can be factored as a total of at most $D$ elements of $S$.

Of course, to perform the QFT efficiently we would like $\rho(\gamma)$ to have a simple form for each
$\gamma \in S$. Given a subgroup $K \le G$, recall that the centralizer of $K$ is the subgroup
$Z(K) = \{g \in G : gk = kg \text{ for all } k \in K\}$. The following is implicit in the oft-cited lemma of Schur:

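Lemma 1 (Schur, [17, Lemma 5.1]) Let $K \le G$, let $\gamma \in Z(K)$, and let $\rho$ be a $K$-adapted
representation of $G$. Suppose that $\rho|_K = \eta_1^{\oplus m_1} \oplus \cdots \oplus \eta_r^{\oplus m_r}$. Then
$\rho(\gamma)$ has the form

\[ (GL_{m_1}(\mathbb{C}) \otimes I_{d_1}) \oplus \cdots \oplus (GL_{m_r}(\mathbb{C}) \otimes I_{d_r}) \tag{3} \]

where $I_k$ is the $k \times k$ identity matrix and $d_i = d_{\eta_i}$.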

Since any unitary operator in $GL_m(\mathbb{C})$ can be carried out with poly($m$) elementary quantum gates [2], and
since we can condition on the $\eta_i$ to find out which subspace of $\rho$ we are in, we can write $\rho(\gamma)$
as a series of poly($M$) elementary quantum operations, where $M = \max_i m_i$ in (3). Therefore, the total number
of elementary quantum operations we need to implement $\rho(\alpha)$ is $D \times$ poly($M$).

Moreover, if $\gamma$ is itself in a subgroup $H > K$, and $\rho$ is adapted to both $H$ and $K$, then
$\rho(\gamma)$ also possesses the block structure corresponding to $\rho|_H$. This places an upper bound on $M$ of
the maximum multiplicity with which representations of $K$ appear in restrictions of representations of $H$. Thus
we can minimize $M$ by choosing generators $\gamma$ inside subgroups as low on the tower as possible, which
centralize subgroups as high on the tower as possible.

For instance, in the symmetric group $S_n$ we take the tower to be $S_n > S_{n-1} > \cdots > \{1\}$, where $S_i$
fixes all elements greater than $i$. Let $S$ be the set of pairwise adjacent transpositions $(j, j+1)$; each of
these is contained in $S_{j+1}$ and centralizes $S_{j-1}$. The maximum multiplicity with which a representation of
$S_{j-1}$ appears in a representation of $S_{j+1}$ is 2, corresponding to the two orders in which we can remove two
cells from a Young diagram. Since the adapted diameter is easily seen to be $O(n^2)$, this means that the
$\rho(\alpha)$ can be carried out in $O(n^2) = \mathrm{polylog}(|S_n|)$ elementary quantum operations [3]. We will
see that a similar situation obtains for a large class of groups.
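
The quadratic bound on the number of generators can be illustrated with a short sketch that factors an arbitrary
permutation into adjacent transpositions. It uses bubble sort, which is one convenient factorization chosen here for
illustration and not necessarily the coset-by-coset factorization used in the text; the word length equals the
number of inversions and is therefore at most $n(n-1)/2$.

```python
import itertools

def adjacent_transposition_word(perm):
    """Indices j such that applying the swaps (j, j+1) in order sorts perm."""
    work = list(perm)
    word = []
    n = len(work)
    for _ in range(n):                 # n passes of bubble sort always suffice
        for j in range(n - 1):
            if work[j] > work[j + 1]:
                work[j], work[j + 1] = work[j + 1], work[j]
                word.append(j)
    return word

def apply_word(word, n):
    """Rebuild the permutation by undoing the recorded swaps, starting from the identity."""
    work = list(range(n))
    for j in reversed(word):
        work[j], work[j + 1] = work[j + 1], work[j]
    return tuple(work)

n = 5
lengths = []
for perm in itertools.permutations(range(n)):
    word = adjacent_transposition_word(perm)
    assert apply_word(word, n) == perm  # the word really factors perm
    lengths.append(len(word))
longest = max(lengths)
```

For $n = 5$ the longest word has length $10 = n(n-1)/2$, attained by the order-reversing permutation.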

4 Efficient quantum Fourier transforms 

We describe our algorithm in this section. The algorithm performs the Fourier transform inductively on the tower of 
subgroups, using the structure of the Bratteli diagram to construct the transform at each level from the transform at the 
previous level. 

Recall that for each level of our tower of subgroups $G = G_m > G_{m-1} > \cdots > G_0 = \{1\}$ we have chosen a
transversal $T_i$ for the left cosets of $G_{i-1}$ in $G_i$. At the beginning of the computation, we represent each
group element $g$ as a product $\alpha = \alpha_m \cdots \alpha_1$ where $\alpha_i \in T_i$. This string becomes
shorter as we work our way up the tower, and after having performed the Fourier transform for $G_i$, the remaining
string $\alpha = \alpha_m \cdots \alpha_{i+1}$ indexes the coset of $G_i$ in $G$ in which $g$ lies.

At the end of the computation, we have a pair of paths in the Bratteli diagram, $s = s_1 \cdots s_m$ and
$t = t_1 \cdots t_m$, which index the rows and columns of the representations $\rho$ of $G$. These paths begin empty
and grow as we work our way up the tower; after having performed the Fourier transform for $G_i$, the paths
$s = s_1 \cdots s_i$ and $t = t_1 \cdots t_i$ of length $i$ index the rows and columns of representations $\sigma$
of $G_i$.

With a compact encoding, one could store $\alpha$ in the same registers as $s$ and $t$, at each step replacing a
coset representative $\alpha_i$ with a pair of edges $s_i, t_i$. However, our algorithm is simpler to describe if we
double the number of qubits and store $\alpha$ and $s, t$ in separate registers. Padding out $\alpha$, $s$, and $t$
to length $m$ with zeroes, our computational basis consists of unit vectors of the form

\[ |\alpha\rangle |s, t\rangle = |\alpha_m \cdots \alpha_{i+1} 0^i\rangle \otimes |s_1 \cdots s_i 0^{m-i},\; t_1 \cdots t_i 0^{m-i}\rangle . \]

Keep in mind that the basis $\{|s, t\rangle\}$, where $s$ and $t$ have length $i$ and end in the same
representation, is just a permutation of our adapted Gel'fand-Tsetlin basis $\{|\sigma, j, k\rangle\}$ for $G_i$,
where $\sigma$ ranges over the representations of $G_i$ and $1 \le j, k \le d_\sigma$ index its rows and columns.
Therefore, we will sometimes abuse notation by writing $\hat{f}(s, t)$ and $\hat{f}(\sigma)_{j,k}$ for the Fourier
transform over $G_i$ indexed in these two different ways.

Each stage of the algorithm consists of calculating the Fourier transform over $G_{i+1}$ from that over $G_i$. By
induction it suffices to consider the last stage, where we go from $H = G_{m-1}$ to $G = G_m$. Specifically, choose
a transversal $T$ of $H$ in $G$ such that every $g \in G$ can be written $\alpha h$ where $\alpha \in T$ and
$h \in H$. For each $\alpha \in T$, define a function $f_\alpha$ on $H$ as $f_\alpha(h) = f(\alpha h)$; this is the
restriction of $f$ to the coset $\alpha H$, shifted into $H$.

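After having performed the Fourier transform on $H$, our state will be

\[ \sum_{\alpha \in T} |\alpha\rangle \otimes \sum_{s,t \text{ of length } m-1} \hat{f}_\alpha(s, t)\, |s, t\rangle = \sum_{\alpha \in T} |\alpha\rangle \otimes \sum_{(\sigma, j, k) \in \hat{H}} \hat{f}_\alpha(\sigma)_{j,k}\, |\sigma, j, k\rangle . \tag{4} \]

Our goal is to transform this state into the Fourier basis of $G$, namely

\[ |0\rangle \otimes \sum_{s,t \text{ of length } m} \hat{f}(s, t)\, |s, t\rangle = |0\rangle \otimes \sum_{(\rho, j, k) \in \hat{G}} \hat{f}(\rho)_{j,k}\, |\rho, j, k\rangle , \tag{5} \]

where $|0\rangle$ occupies the register that held the coset representative $\alpha$ before.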

This transformation is greatly simplified by the following two observations, which are common to nearly every
algorithm for the FFT. First, as described in Equation (2) above, $\hat{f}$ can be written as a sum over
contributions from $f$'s values on each coset $\alpha H$, giving

\[ \hat{f}(\rho) = \sum_{\alpha \in T} \rho(\alpha) \cdot \hat{f}_\alpha(\rho) . \tag{6} \]

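Since $f_\alpha$ has support only in $H$, the matrix $\hat{f}_\alpha(\rho)$ is a direct sum of sub-matrices of the
form $\hat{f}_\alpha(\sigma)$, summed over the $\sigma$ appearing in $\rho$. In the quantum setting we accomplish
this via an embedding operation which reverses the restriction to $H$,

\[ |\sigma\rangle \to \sum_{\rho :\, \sigma \text{ appears in } \rho|_H} A_{\sigma,\rho}\, |\rho\rangle \tag{7} \]

where this "scale factor" is

\[ A_{\sigma,\rho} = \sqrt{\frac{|H|}{|G|} \frac{d_\rho}{d_\sigma}} . \]

(Note that $\sum_\rho |A_{\sigma,\rho}|^2 = 1$.)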
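
The normalization of the scale factors can be checked exactly for the pair $H = S_{n-1}$, $G = S_n$. The sketch
below assumes the scale factor is the ratio of the normalization constants of Definition 1, that is
$|A_{\sigma,\rho}|^2 = (|H|/|G|)(d_\rho/d_\sigma)$, computes dimensions with the hook length formula, and sums over
the partitions $\rho$ obtained from $\sigma$ by adding one cell. The helper names are illustrative.

```python
from fractions import Fraction
from math import factorial

def hook_dimension(partition):
    """Dimension of the irrep of S_n labelled by a partition, via the hook length formula."""
    n = sum(partition)
    hooks = 1
    for i, row in enumerate(partition):
        for j in range(row):
            arm = row - j - 1                                   # cells to the right
            leg = sum(1 for r in partition[i + 1:] if r > j)    # cells below
            hooks *= arm + leg + 1
    return factorial(n) // hooks

def extensions(partition):
    """Partitions of n+1 obtained by adding one cell to a partition of n."""
    out = []
    for i in range(len(partition) + 1):
        row = partition[i] if i < len(partition) else 0
        above = partition[i - 1] if i > 0 else None
        if above is None or row < above:
            out.append(partition[:i] + (row + 1,) + partition[i + 1:])
    return out

def squared_scale_factor(sigma, rho):
    """|A_{sigma,rho}|^2 = (|H|/|G|) * (d_rho/d_sigma) for H = S_{n-1} and G = S_n."""
    n = sum(rho)
    return (Fraction(factorial(n - 1), factorial(n))
            * Fraction(hook_dimension(rho), hook_dimension(sigma)))

totals = {
    sigma: sum(squared_scale_factor(sigma, rho) for rho in extensions(sigma))
    for sigma in [(2, 1), (2, 2), (3, 1, 1), (4,)]
}
```

Each total equals exactly 1, reflecting the fact that the induced representation of $\sigma$ has dimension
$[G : H]\, d_\sigma$.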

Thus the algorithm consists of (i.) embedding the $\sigma$ in the appropriate $\rho$, (ii.) applying the "twiddle
factor" $\rho(\alpha)$, and (iii.) summing over the cosets. However, in general, doing these things efficiently is
no simple matter. First, a given $\sigma$ might appear in a given $\rho$ with an arbitrary change of basis; the
twiddle $\rho(\alpha)$ could be an arbitrary unitary matrix of exponential size; and summing over an exponential
number of cosets will take exponential time unless parallelized in some way.

It is here that the Bratteli diagram proves to be extremely helpful. It allows us to implement the twiddle factors
$\rho(\alpha)$ efficiently when coupled with a strong generating set as discussed in Section 3.2, by providing an
adapted basis. It simplifies the embedding operation as well: first note that $\hat{f}_\alpha(s, t)$ is nonzero only
when $s$ and $t$ end in the same representation $\sigma$ of $H$, i.e. in the same vertex of the diagram. Moreover,
recall that the Bratteli diagram indexes an adapted basis in which $\rho|_H$ is block-diagonal with the $\sigma_j$
as its blocks. This means that the $\sigma$ appear in the $\rho$ in an extremely simple way: namely, where $s$ and
$t$ are extended by appending the same edge $e$ to both.

Let us adopt some notation. Given a path $s$ in the Bratteli diagram of length $m-1$ or $m$, denote the
representation in which it ends by $\sigma[s]$ or $\rho[s]$ respectively, and if $s = s_1 \cdots s_{m-1}$, denote
$s_1 \cdots s_{m-1} e$ as $se$. We index the outgoing edges of each vertex by $\{1, \ldots, k\}$, where $k$ is its
out-degree. It will be convenient to carry out this embedding only if the register containing the coset
representative is zero, and leave other basis vectors in $(T \cup \{0\}) \otimes \hat{H}$ fixed. Then (7) becomes

\[ U : \begin{cases} |0\rangle |s, t\rangle \to |0\rangle \sum_e A_{\sigma[s], \rho[se]}\, |se, te\rangle \\ |\alpha\rangle |s, t\rangle \to |\alpha\rangle |s, t\rangle & \text{for all } \alpha \in T \end{cases} \tag{8} \]

where the sum is over all outgoing edges $e$ of $\sigma[s] = \sigma[t]$.

Note that we have not defined $U$ on the entire space; in particular, since we are moving probability from
$\hat{H}$ to $\hat{G}$, basis vectors $|0\rangle |se, te\rangle \in (T \cup \{0\}) \otimes \hat{G}$ cannot stay
fixed. As we will see below, it does not matter precisely how $U$ behaves on the rest of the state space, as long as
its behavior on $\hat{H}$ is as described in (8). This can be accomplished simply by putting the $m$'th registers of
$s$ and $t$ in the superposition $\sum_e A_{\sigma[s], \rho[se]}\, |e\rangle \otimes |e\rangle$, and for a large
class of extensions we can prepare this superposition efficiently.



4.1 Extensions of subexponential index 

In this section we generalize Beals' QFT for the symmetric group [3] to a large class of groups. First we show that
the Fourier transform can be extended from $H$ to $G$, modulo some reasonable uniformity conditions on $G$.

Definition 2 For a group $G$ and a tower of subgroups $G_i$, let $\mathcal{B}$ be the corresponding Bratteli
diagram, let $T_i$ be a set of coset representatives at each level, and let $S$ be a strong set of generators for
$G$. Then we say that $G$ is polynomially uniform (with respect to $\{G_i\}$, $\mathcal{B}$, $\{T_i\}$, and $S$) if
the following functions are computable by a classical algorithm in polylog($|G|$) time:

1. Given two paths $s, t$ in $\mathcal{B}$, whether $\rho[s] = \rho[t]$;

2. Given a path $s$ in $\mathcal{B}$, the dimension and the out-degree of $\rho[s]$;

3. Given a coset representative $\alpha_i \in T_i$, a factorization of $\alpha_i$ as a word of polylog($|G|$) length
in $(S \cap G_i)^*$.

Lemma 2 If $G$ is polynomially uniform with respect to a tower of subgroups where $G = G_m$ and $H = G_{m-1}$ and a
strong generating set $S$ with adapted diameter $D$ and maximum multiplicity $M$, then the Fourier transform of $G$
can be obtained from the state (4) using poly($[G : H] \times D \times M \times \log |G|$) elementary quantum
operations.

Proof. First, to carry out the embedding transformation $U$, we use the classical algorithm to compute the list of
edges $e$ and $d_{\rho[se]}$ conditional on $s$, and thus compute the $A_{\sigma,\rho}$ (say, to $n$ digits in
poly($n$) time). Note that $\sigma$ appears in at most $[G : H]$ many $\rho$. We then carry out a series of
$[G : H]$ conditional rotations, each of which rotates the appropriate amplitude from $|0\rangle |s, t\rangle$ to
$|0\rangle |se, te\rangle$. Thus $U$, and therefore $U^{-1}$, can be carried out in $O([G : H])$ quantum operations.

To apply the twiddle factor and sum over the cosets as in (6), we use a technique of Beals [3] and carry out the
following for-loop. For each $\alpha \in T$, we do the following three things: left multiply $\hat{f}(\rho)$ by
$\rho(\alpha)^{-1}$; add $\hat{f}_\alpha(\rho)$ to $\hat{f}(\rho)$; and left multiply $\hat{f}(\rho)$ by
$\rho(\alpha)$. This loop clearly produces $\sum_{\alpha \in T} \rho(\alpha) \cdot \hat{f}_\alpha(\rho)$; we just
need to show that each of these three steps can be carried out efficiently.
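
The algebra behind the for-loop can be checked with ordinary matrices. The sketch below is a classical simulation
only: it uses made-up $2 \times 2$ rotation matrices as stand-ins for the twiddle factors $\rho(\alpha)$ and
arbitrary blocks as stand-ins for the $\hat{f}_\alpha(\rho)$, and it says nothing about how the "add" step is made
unitary. It confirms that one pass of the three steps adds $\rho(\alpha) \cdot \hat{f}_\alpha(\rho)$ to the running
total.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def matadd(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rotation(theta):
    """A 2x2 rotation matrix; its inverse is its transpose."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical stand-ins: one twiddle factor and one transformed block per coset.
twiddles = [rotation(0.0), rotation(0.7), rotation(2.1)]
blocks = [[[1.0, 2.0], [0.0, 1.0]], [[0.5, -1.0], [3.0, 0.0]], [[-2.0, 1.0], [1.0, 1.0]]]

def beals_loop(twiddles, blocks):
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for rho_alpha, f_alpha_hat in zip(twiddles, blocks):
        inverse = [list(col) for col in zip(*rho_alpha)]  # transpose of an orthogonal matrix
        acc = matmul(inverse, acc)       # left multiply by rho(alpha)^{-1}
        acc = matadd(acc, f_alpha_hat)   # add the contribution of this coset
        acc = matmul(rho_alpha, acc)     # left multiply by rho(alpha)
    return acc

def direct_sum_over_cosets(twiddles, blocks):
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for rho_alpha, f_alpha_hat in zip(twiddles, blocks):
        acc = matadd(acc, matmul(rho_alpha, f_alpha_hat))
    return acc

looped = beals_loop(twiddles, blocks)
direct = direct_sum_over_cosets(twiddles, blocks)
```

The two results agree up to rounding, since each iteration maps the running total $F$ to
$\rho(\alpha)(\rho(\alpha)^{-1} F + \hat{f}_\alpha) = F + \rho(\alpha)\hat{f}_\alpha$.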

Recall that $\hat{f}(\rho)$ is given in the $|s, t\rangle$ basis, where $s$ and $t$ index the row and column of
$\rho$ respectively. To left multiply $\hat{f}(\rho)$ by $\rho(\alpha)$, we apply $\rho(\alpha)$ to the $s$ register
and leave the $t$ register unchanged. Since $G$ is polynomially uniform, a classical algorithm can factor $\alpha$
as the product of $D$ generators $\gamma_i \in S$, and provide a factorization of each $\rho(\gamma_i)$ as the
product of poly($M$) many elementary quantum operations, in polylog($|G|$) time. This implements $\rho(\alpha)$ and
$\rho(\alpha)^{-1}$ in $D \times$ poly($M$) + polylog($|G|$) operations.

The step "add $\hat{f}_\alpha(\rho)$ to $\hat{f}(\rho)$" is slightly more mysterious, and indeed it does not even
sound unitary at first. However, as Beals points out, at each point in the loop we are adding
$\hat{f}_\alpha(\rho)$, which is the Fourier transform of a function with support only on $H$, to
$\sum_{\beta < \alpha} \rho(\alpha^{-1}\beta)\, \hat{f}_\beta(\rho)$, which is the Fourier transform of a function
with support only outside $H$. Thus these two states are orthogonal, and adding two orthogonal vectors can be done
unitarily by rotating one vector into the other while fixing the subspace perpendicular to both. Let $V_\alpha$ be
the operation that exchanges $|\alpha\rangle |s, t\rangle$ with $|0\rangle |s, t\rangle$ and leaves
$|\beta\rangle |s, t\rangle$ fixed for all $\beta \neq \alpha, 0$; then Beals showed that this step can be written
$U^{-1} V_\alpha U$, where $U$ is the embedding operator defined in (8). We showed earlier that $U$ can be carried
out in $O([G : H])$ quantum operations, and $V_\alpha$ is simply a Boolean operation on the $\alpha$ register.
Finally, the for-loop runs $|T| = [G : H]$ times, so we're done. □

Proof of Theorem 1. This follows immediately from the fact that the depth of the Bratteli diagram is at most
$\log |G|$. □

As noted above, for many groups, the maximum index $I = \max_i [G_i : G_{i-1}]$, the adapted diameter $D$, and the
maximum multiplicity $M$ are all polylog($|G|$). In this case, Theorem 2 gives circuits for the QFT of
polylog($|G|$) size. This includes the following three families of groups:

The symmetric groups $S_n$. As stated above, we take the tower $S_n > S_{n-1} > \cdots > \{1\}$ where $S_i$ fixes
all elements greater than $i$. The maximum index is then $n = o(\log |S_n|)$. The generators are the adjacent
transpositions; the adapted diameter is $O(n^2)$ and the maximum multiplicity is 2. The adapted basis is precisely
the Young orthogonal basis.

Wreath products $G = H \wr S_n$ for $H$ of size poly($n$). These groups arise naturally as automorphism groups of
graphs obtained by composition [12]. As in [23], the tower is

\[ H \wr S_n > H \times (H \wr S_{n-1}) > H \wr S_{n-1} > \cdots > \{1\} . \]

The maximum index is $\max(n, |H|)$, the generators are the adjacent transpositions and an arbitrary set of
$\log |H|$ generators for each factor of $H$, the adapted diameter is $O(n^2 \log |H|)$, and the maximum
multiplicity is $O(|H|)$. Then note that $|H| = \mathrm{polylog}(|G|)$. See [17] for details and [15] for discussion
on wreath products.

The Clifford groups. The Clifford groups $CL_n$ are generated by $x_1, \ldots, x_n$ where $x_i^2 = 1$ and
$x_i x_j = -x_j x_i$ for all $i \neq j$ [26]. We take the tower $CL_n > CL_{n-1} > \cdots > \{1\}$, which has
maximum index 2, and the generators $\{x_1, x_1 x_2, \ldots, x_{n-1} x_n\}$. The adapted diameter is $O(n)$, and
since each $x_i x_{i+1}$ centralizes $CL_{i-1}$, the maximum multiplicity is 4.

In addition to giving polylog(|G|)-size circuits for these groups, this technique also gives the first subexponential- 
size circuits for the following classical groups: 

The linear groups $GL_n(q)$, $SL_n(q)$, $PGL_n(q)$, and $PSL_n(q)$; the finite groups of Lie type; the Chevalley and
Weyl groups. The case of $GL_n(q)$ is emblematic of all these families. We have a natural tower:

\[ GL_n(q) > P_n(q) > GL_{n-1}(q) \times GL_1(q) > GL_{n-1}(q) > \cdots > \{1\} . \]

Here $P_k(q)$ is the so-called maximal parabolic subgroup of the form shown in Figure 2, where
$A \in GL_{k-1}(q)$, $v \in \mathbb{F}_q^{k-1}$, and $c \in \mathbb{F}_q^{\times}$. Our generators are
block-diagonal with an arbitrary element of $GL_2(q)$ in the $i, i-1$ block and all other diagonal elements equal to
1. The adapted diameter is $O(n^2)$, the maximum index is $q^{O(n)}$, and the maximum multiplicity is $q^{O(n)}$.
Analogous factorizations arise in the case of the finite groups of Lie type as well as the finite unitary groups
[18].

Theorem 2 then implies a quantum circuit of size $q^{O(n)}$ for the QFT over these groups. Since $|G| = O(q^{n^2})$ we
can write this as $|G|^{O(1/n)}$, which is $\exp(O(\sqrt{\log |G|}))$ if $q$ is fixed. Note that the best-known classical algorithm for
these groups [17] has complexity $|G|\,q^{\Theta(n)} = |G|^{1+\Theta(1/n)}$; therefore, we argue that this quantum speedup is the most we
could expect relative to the existing classical algorithm. Note, for instance, that for the group families above for which
we obtain circuits of size polylog(|G|), there are classical algorithms of complexity |G| polylog(|G|). In both cases it
appears that the natural quantum speedup is to remove a factor of |G| (modulo polylogarithmic terms).
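
The rewriting of the circuit size in terms of $|G|$ can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and `c` stands for an assumed value of the constant hidden in the $O(n)$ exponent. It uses the standard formula $|\mathrm{GL}_n(q)| = \prod_{i=0}^{n-1}(q^n - q^i)$ and reports the exponents $a$ and $b$ for which $q^{cn} = |G|^{a} = \exp(b\sqrt{\log|G|})$.

```python
import math


def log_order_gl(n, q):
    """Natural logarithm of |GL_n(q)| = prod_{i=0}^{n-1} (q^n - q^i)."""
    return sum(math.log(q**n - q**i) for i in range(n))


def size_exponents(n, q, c=1.0):
    """For a circuit of size q^(c*n), with c an assumed constant,
    return (a, b) such that size = |G|^a = exp(b * sqrt(log |G|))."""
    log_size = c * n * math.log(q)
    log_g = log_order_gl(n, q)
    return log_size / log_g, log_size / math.sqrt(log_g)


for n in (4, 8, 16, 32):
    a, b = size_exponents(n, q=2)
    # a * n stays close to c, and b stays close to c * sqrt(log q).
    print(n, round(a * n, 3), round(b, 3))
```

As $n$ grows with $q$ fixed, $a$ shrinks like $1/n$ while $b$ stays bounded, which is the sense in which the size is $|G|^{O(1/n)} = \exp(O(\sqrt{\log|G|}))$.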

4.2 Extensions of exponential index and Coppersmith-type circuits

The reader familiar with Coppersmith's circuit for the QFT over $G = \mathbb{Z}_{2^n}$, where $H = \mathbb{Z}_{2^{n-1}}$, will recall that the
Hadamard gate embeds a character $\sigma \in \hat{H}$ in two characters $\rho \in \hat{G}$, applies part of the twiddle factor, and sums over
the two cosets of $H$, all in one operation. This is in contrast to Beals' technique, which sums over the cosets serially.
Indeed, if the index $[G : H]$ is exponential (for instance, if $G$ is an extension of $H$ by $\mathbb{Z}_p$ where $p$ is exponentially
large) then Beals' technique takes exponential time.

For a certain type of extension, we can construct circuits analogous to Coppersmith's, which use quantum parallelism
to embed $\sigma$ in the $\rho$, sum over all $p$ cosets simultaneously, and apply the twiddle factor as well. Recall that $G$
is a split extension or semidirect product of $H$ by $T$, written $T \ltimes H$, if $H \lhd G$ and there is a transverse subgroup $T < G$
so that $T \cong G/H$.

Definition 3 Suppose $G$ is a split extension of $H$ by $T$, and let $S$ be a set of at most $\log_2 |T|$ generators for $T$, and
suppose that $G$ is polynomially uniform with respect to a tower of subgroups where $G = G_m$ and $H = G_{m-1}$ and a
Bratteli diagram $\mathfrak{B}$. Then $G$ is a homothetic extension of $H$ by $T$ if

1. Given $\sigma \in \hat{H}$ and $\gamma \in S$, define $\sigma^{\gamma}(h) = \sigma(\gamma^{-1}h\gamma)$. Then for every $\sigma \in \hat{H}$, either $\sigma^{\gamma} = \sigma$, or the orbit of $q$ distinct
representations $\sigma^{\gamma^j}$ for $0 \le j < q$, where $q$ divides the order of $\gamma$, appears among the representations of $H$ given
by $\mathfrak{B}$.

2. For each $\gamma \in S$, there is a classical algorithm which runs in polylog(|G|) time which, given a path $s$ in $\mathfrak{B}$ indexing
a row of $\sigma[s]$ and an integer $j$, returns the size $q$ of $\sigma$'s orbit under conjugation by $\gamma$, and returns a path $s^{\gamma^j}$ that
indexes the same row of $\sigma[s^{\gamma^j}] = \sigma^{\gamma^j}$.

Theorem 2 If $G$ is a homothetic extension of $H$ by an Abelian group, then the Fourier transform of $G$ can be obtained
from the state using polylog(|G|) elementary quantum operations.

Proof. It is easy to show that a homothetic extension of $H$ by $A \times B$ consists of a homothetic extension of $H$ by $A$,
followed by a homothetic extension by $B$. Therefore it suffices to prove the claim for homothetic extensions by cyclic
groups of prime power order, so let $T$ be generated by $\gamma$ of order $p^{\ell}$.

We recall some representation theory from [6, 22]. Given $\sigma \in \hat{H}$, the stabilizer of $\sigma$ is $K = \{x \in T : \sigma^x \cong \sigma\}$, and
for a homothetic extension we can replace $\sigma^x \cong \sigma$ with $\sigma^x = \sigma$. Then $K$ is the subgroup of $T$ of order $|K| = p^{\ell}/q$ generated by
$\gamma^q$, where $q$ is a power of $p$, and $\sigma$'s orbit under conjugation by $\gamma$ is of size $q$.

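The representations $\rho$ in which $\sigma$ appears can be obtained in two steps. First, we extend $\sigma$ to $K \ltimes H$ by multiplying
$\sigma$ by one of the $|K|$ characters $\chi_b$ of $K$. This yields $\tau_b \in \widehat{K \ltimes H}$ where $\tau_b(\gamma^{qj}h) = \chi_b(\gamma^{qj})\,\sigma(h)$ and $\chi_b(\gamma^{qj}) = \omega_{|K|}^{bj}$. Since
$d_{\tau_b} = d_\sigma$, $\sigma$ embeds in a uniform superposition over the $\tau_b$ with amplitudes $1/\sqrt{|K|}$, so we append a uniform
superposition of edges $1 \le e \le |K|$ where $b = e - 1$. Combining this with the twiddle factor gives the unitary
transformation

$$|\gamma^{qj+k}\rangle \otimes |s,t\rangle \;\mapsto\; |\gamma^{k}\rangle \otimes \frac{1}{\sqrt{|K|}} \sum_{e=1}^{|K|} \omega_{|K|}^{bj}\, |se,te\rangle . \qquad (9)$$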

Here we write the power of $\gamma$ in two registers $0 \le j < |K|$ and $0 \le k < q$. Then this operation Fourier transforms the
first register over $\mathbb{Z}_{|K|}$ and transfers the result to the $m$th register of $s$ and $t$. This transform can be carried out with
$O(\log |K| \log\log |K|) = O(\log |G| \log\log |G|)$ elementary operations [10, 16]. Note that $|K|$ takes at most $\log |G|$ different
values, and can be obtained from the classical algorithm which computes $q$.

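If $K = T$, then the $\rho \in \hat{G}$ containing $\sigma$ are simply the extensions $\tau_b$, and we're done. If $K < T$, i.e. if $q > 1$, we carry
out a second step as follows. Each $\chi_b$ appears in a single induced representation $\rho_b$ whose restriction to $K \ltimes H$ is the
direct sum of all the representations in $\sigma$'s orbit, times $\chi_b$: that is, $\rho_b|_{K \ltimes H} = \chi_b \otimes \bigoplus_{i=0}^{q-1} \sigma^{\gamma^i}$. The twiddle factor $\rho_b(\gamma^k)$
is then a permutation matrix which cycles these $q$ blocks $k$ times, with an additional phase change $\omega_{p^{\ell}}^{bk}$. This gives the
unitary transformation

$$|\gamma^{k}\rangle \otimes |se,te\rangle \;\mapsto\; \omega_{p^{\ell}}^{bk}\, |\gamma^{0}\rangle \otimes |s^{\gamma^k}e, te\rangle . \qquad (10)$$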

Since $s^{\gamma^k}$ can be calculated by the classical algorithm in polylog(|G|) time, and since it is easy to implement $\omega_{p^{\ell}}^{bk}$ with
phase shifts $\omega_{p^{\ell}}^{2^y}$ for $0 \le y < \log_2 bk$ conditioned on the binary digit sequence of $bk$, we can perform this operation in
polylog(|G|) quantum steps. Composing (9) and (10) transforms the state to the Fourier transform over $G$. □
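
The last step of the proof builds a phase out of fixed phase shifts selected by binary digits. The following is a small classical sketch of that bookkeeping, under stated assumptions: `phase_by_bits` and its parameters are hypothetical names, `order` plays the role of the order of the root of unity, and each multiplication stands in for one conditional phase-shift gate.

```python
import cmath


def phase_by_bits(b, k, order):
    """Assemble omega^(b*k), omega = exp(2*pi*i/order), from the fixed
    phase shifts omega^(2^y), applying one for each set binary digit of b*k."""
    omega = cmath.exp(2j * cmath.pi / order)
    exponent = b * k
    phase = 1 + 0j
    y = 0
    while exponent >> y:
        if (exponent >> y) & 1:
            phase *= omega ** (2 ** y)   # shift number y, conditioned on digit y
        y += 1
    return phase


direct = cmath.exp(2j * cmath.pi * 3 * 5 / 16)
print(abs(phase_by_bits(3, 5, 16) - direct) < 1e-9)
```

The number of shifts applied is at most the number of binary digits of the product, which is what keeps this step polylogarithmic.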

Closure under homothetic extensions and the metacyclic groups. Theorem 2 shows that the set of groups for
which circuits of polylog(|G|) size exist is closed under homothetic extensions by Abelian groups. It also generalizes
the efficient quantum Fourier transform of Høyer [13] for the metacyclic groups $\mathbb{Z}_q \ltimes \mathbb{Z}_p$, since these are homothetic
extensions of $\mathbb{Z}_p$ by $\mathbb{Z}_q$. Note that the metacyclic groups include the dihedral groups (where $q = 2$) and the affine
groups (where $q = p - 1$) as special cases.
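
For the metacyclic groups the orbit condition of a homothetic extension can be made concrete. The sketch below is an illustration under stated assumptions, not the paper's construction: it assumes the generator $\gamma$ of $\mathbb{Z}_q$ acts on $\mathbb{Z}_p$ by $\gamma^{-1}h\gamma = r^{-1}h$ for some $r$ whose multiplicative order modulo $p$ divides $q$, and it labels the character $\sigma_a(h) = e^{2\pi i a h/p}$ by $a$, so that $\sigma_a^{\gamma} = \sigma_{a r^{-1}}$. The function name `character_orbits` is hypothetical.

```python
def character_orbits(p, r, q):
    """Orbits of the characters sigma_a of Z_p under a -> a * r^{-1} (mod p).

    Assumes gamma^{-1} h gamma = r^{-1} h, with r of order dividing q modulo p.
    """
    if pow(r, q, p) != 1:
        raise ValueError("r must have multiplicative order dividing q modulo p")
    r_inv = pow(r, -1, p)               # modular inverse (Python 3.8+)
    seen, orbits = set(), []
    for a in range(p):
        if a in seen:
            continue
        orbit, b = [], a
        while b not in seen:
            seen.add(b)
            orbit.append(b)
            b = (b * r_inv) % p
        orbits.append(orbit)
    return orbits


print(character_orbits(7, 2, 3))   # Z_3 acting on Z_7
print(character_orbits(5, 4, 2))   # dihedral case: gamma inverts Z_5
```

In both examples the trivial character is fixed and every other orbit has a size dividing $q$, which is the behaviour the first condition of the definition asks for.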

The general case. In general, Abelian extensions can be slightly more complicated; consider extensions by $\mathbb{Z}_p$. If $\sigma^{\gamma}$
is isomorphic to $\sigma$, rather than equal to it, $\gamma$ induces an additional twiddle factor $C(\gamma)$ which changes $\sigma$'s basis [22].
This occurs, for instance, if $\gamma^p$ is an element of $H$ other than the identity, in which case the cyclic group generated by
$\gamma$ is not transverse to $H$ and the extension is not split. In this case $C(\gamma)$ is a $p$th root of $\sigma(\gamma^p)$.

Relation to Coppersmith's circuit. Let $\gamma$ be a generator of $G = \mathbb{Z}_{2^n}$. Then $G$ is an extension of $H = \mathbb{Z}_{2^{n-1}}$ with
transversal $\{1, \gamma\}$. Since $\gamma^2 \ne 1$, $\gamma$ induces an additional phase shift $C(\gamma) = \sqrt{\sigma(\gamma^2)}$. (Similarly, the additional
phase shift in (10) is due to the fact that $\mathbb{Z}_{p^{\ell}}$ is not a split extension of $\mathbb{Z}_{|K|}$.) In Coppersmith's circuit, $C(\gamma)$ appears
as a set of phase shift gates conditional on the low-order bit of $j$. Finally, the Hadamard gate in Coppersmith's circuit
is precisely the operation (9) in the case $p = 2$, $\ell = 1$ and $q = 1$, and where we use the same qubit register for $e$ (the
high-order bit of the frequency) as for $\alpha$ (the low-order bit of the time).
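
The structure described here (transform over $H$ on each coset, a twiddle factor, then a two-point combination) has a familiar classical counterpart. The following is a classical sketch only, not a quantum circuit: it assumes the unnormalized convention $\hat{f}(y) = \sum_x f(x)\,e^{2\pi i xy/N}$, and the function names are hypothetical. The $\pm$ combination in the loop plays the role of the Hadamard gate, and `twiddle` plays the role of the phase shift conditioned on the low-order bit.

```python
import cmath


def dft(f):
    """Direct Fourier transform over Z_N, for comparison."""
    n = len(f)
    return [sum(f[x] * cmath.exp(2j * cmath.pi * x * y / n) for x in range(n))
            for y in range(n)]


def fft_by_cosets(f):
    """Fourier transform over Z_N (N a power of 2) through the index-2
    subgroup H of even elements."""
    n = len(f)
    if n == 1:
        return list(f)
    even = fft_by_cosets(f[0::2])       # transform over H on the coset H
    odd = fft_by_cosets(f[1::2])        # transform over H on the coset 1 + H
    out = [0j] * n
    for s in range(n // 2):
        twiddle = cmath.exp(2j * cmath.pi * s / n)
        # Character s of H embeds in the two characters s and s + N/2 of G.
        out[s] = even[s] + twiddle * odd[s]
        out[s + n // 2] = even[s] - twiddle * odd[s]
    return out


f = [1, 2, 0, -1, 3, 0.5, -2, 1]
print(all(abs(u - v) < 1e-9 for u, v in zip(fft_by_cosets(f), dft(f))))
```

Classically the two cosets are handled one after the other; the point of the quantum version is that the combination over cosets happens in a single gate.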

The quaternionic groups. Another example is the generalized quaternion group, which is an extension of $H = \mathbb{Z}_{2n}$
by $\mathbb{Z}_2$ where $\gamma^2$ is the element of order 2 in $H$. Then $C(\gamma) = \sqrt{\sigma(\gamma^2)} = 1$ or $i$. Püschel, Rötteler and Beth [21] gave an
efficient quantum Fourier transform for these groups in the case where $n$ is a power of 2. Of course, these groups are
extensions of Abelian groups with bounded index, so Lemma 2 already provides an efficient QFT for them.

Metabelian groups. Even if an extension is neither homothetic nor of polynomial index, we can still construct an
efficient QFT if we can apply arbitrary powers of $C(\gamma)$ in polynomial time. This is true, for instance, if $C(\gamma)$ is
of polynomial size, which is true whenever all the representations of $H$ are of polynomial size. This includes the
metabelian groups, i.e. split extensions of Abelian groups by Abelian groups, since all the representations of $H$ are
one-dimensional. We discuss this further in the full paper.

5 Conclusion and open problems 

The separation of variables is in essence a coarse scale use of a factorization of the dual, using blockwise redundancy as 
well as sparseness. It is possible to use the Bratteli diagram indexing and accompanying path factorizations in a more 
precise fashion, effectively looking for redundancy and sparsity on the level of individual elements. This finer analysis 
is responsible for the fastest known classical FFTs for the groups $\mathrm{SL}_2(q)$, as well as $S_n$ and its wreath products [19]. It
would be interesting to investigate the possibility of adapting these techniques to the quantum setting. 